We start by reading in the raw CSV files for the USD/JPY exchange rate, Gold futures, the VIX index, and financial news sentiment. The exchange rate data were obtained from Macrotrends, the Gold data from Yahoo Finance, the VIX data directly from CBOE, and the news sentiment data from the San Francisco Fed. The data were initially gathered in March 2025, with a cutoff date of 15 March. We’ll use these data for our analyses and modeling, and if so inclined, we can use newer data to evaluate the models we fit. After downloading the raw data, I took a look at them in Excel and did some preliminary cleaning and sorting, though more manipulation is needed, as we’ll see in this notebook.
At this point, we have no idea how any of these series might be related or might behave. I selected them because I thought they might be related to the VIX: the JPY is a safe haven currency, Gold is a safe haven asset, and news sentiment is a potential proxy for market sentiment, which could plausibly influence market volatility.
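# packages used throughout this notebook
library(dplyr)    # %>%, mutate, full_join, left_join, filter, arrange
library(purrr)    # reduce
library(TSA)      # eacf (assumed source of eacf(); not stated in the original)
library(forecast) # Arima, auto.arima, forecast
library(vars)     # VAR
library(plotly)   # plot_ly
# data ranges from 01-02-1990 to 03-15-2025
# read USDJPY.csv, GOLD.csv, and VIX.csv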
jpy <- read.csv("data/march/USDJPY.csv")
gold <- read.csv("data/march/GOLD.csv")
vix <- read.csv("data/march/VIX.csv")
# read sentiment data
fin_sen <- read.csv("sentiment/news_sentiment_data.csv")
fin_sen$date <- as.Date(fin_sen$date, format = "%Y-%m-%d")
head(fin_sen)
## date News.Sentiment
## 1 1980-01-01 -0.04
## 2 1980-01-02 -0.11
## 3 1980-01-03 -0.09
## 4 1980-01-05 -0.07
## 5 1980-01-06 -0.09
## 6 1980-01-07 -0.13
head(vix)
## Date VIX.OPEN VIX.HIGH VIX.LOW VIX.Close
## 1 3/14/2025 24.35 24.36 21.48 21.77
## 2 3/13/2025 24.92 26.13 23.46 24.66
## 3 3/12/2025 26.88 26.91 23.89 24.23
## 4 3/11/2025 27.94 29.57 26.18 26.92
## 5 3/10/2025 24.70 29.56 24.68 27.86
## 6 3/7/2025 24.85 26.56 23.09 23.37
Here, we add a ‘Next.Close’ column by shifting the VIX close values by one row. Because the raw file is sorted newest first, the value shifted into each row is the following trading day’s close. Originally, I did this to check for potential relationships between the various “predictors” we’ll be looking at and the next day’s VIX close. Ultimately, though, as I didn’t find the Gold data to be usable, I settled on just using the same-day values in a vector autoregression (VAR) model for now.
# IMPORTANT: DON'T FORGET ABOUT THIS
# ADD COLUMN NEXT.CLOSE BY SHIFTING VIX.CLOSE DOWN BY 1 ROW (FIRST ROW BECOMES NA);
# ROWS ARE SORTED NEWEST FIRST, SO EACH ROW RECEIVES THE FOLLOWING DAY'S CLOSE
vix <- vix %>%
  mutate(Next.Close = dplyr::lag(VIX.Close))
head(vix)
## Date VIX.OPEN VIX.HIGH VIX.LOW VIX.Close Next.Close
## 1 3/14/2025 24.35 24.36 21.48 21.77 NA
## 2 3/13/2025 24.92 26.13 23.46 24.66 21.77
## 3 3/12/2025 26.88 26.91 23.89 24.23 24.66
## 4 3/11/2025 27.94 29.57 26.18 26.92 24.23
## 5 3/10/2025 24.70 29.56 24.68 27.86 26.92
## 6 3/7/2025 24.85 26.56 23.09 23.37 27.86
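To make the direction of the shift concrete, here is a minimal base-R sketch on four closes copied from the `head(vix)` output above. The object names (`close`, `next_close`, `shifted`) are illustrative only and are not part of the notebook’s pipeline.

```r
# Toy closes taken from the head(vix) output above, newest first
close <- c(21.77, 24.66, 24.23, 26.92)
dates <- c("3/14/2025", "3/13/2025", "3/12/2025", "3/11/2025")

# Shift down by one row: row i receives the close from row i - 1,
# which is the following trading day because rows are newest first
next_close <- c(NA, head(close, -1))

shifted <- data.frame(Date = dates, VIX.Close = close, Next.Close = next_close)
print(shifted)
```

The first row has no following day in the file, which is why it ends up as `NA` and is dropped later.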
Here, we’re narrowing down exactly what data we need; I don’t need daily highs, lows, or any other stuff for most of these data, so we’ll just take the daily close values (or volumes) and dates. We’ll clean the data briefly by dropping missing entries, then merge the datasets into a single dataframe for ease of use.
jpy <- jpy[-1, c(1, 2)]
gold <- gold[-1, c(1, 2, 6)]
vix <- vix[-1, c(1, 5, 6)]
vix_df <- reduce(list(jpy, gold, vix), full_join, by = "Date")
vix_clean <- na.omit(vix_df)
vix_clean[,1] <- as.Date(vix_clean[,1], format = "%m/%d/%Y")
names(vix_clean)[2] <- "USD.JPY"
vix_clean[,4] <- as.numeric(as.character(vix_clean[,4]))
## Warning: NAs introduced by coercion
# head(vix_clean)
# join vix_clean and fin_sen on date
vix_clean <- vix_clean %>%
left_join(fin_sen, by = c("Date" = "date"))
# drop rows with missing Gold.Volume, Gold.Price, and USD.JPY
vix_clean <- vix_clean %>%
filter(!is.na(Gold.Volume) & !is.na(Gold.Price) & !is.na(USD.JPY))
# sort by date (ascending), then add a running index of trading days
vix_clean <- vix_clean %>%
  arrange(Date) %>%
  mutate(Day = row_number())
head(vix_clean)
## Date USD.JPY Gold.Price Gold.Volume VIX.Close Next.Close News.Sentiment
## 1 1992-12-30 124.55 333.3 1760 12.60 12.57 0.23
## 2 1992-12-31 124.80 333.1 20180 12.57 13.36 0.24
## 3 1993-01-04 125.30 328.4 14850 13.36 13.35 0.27
## 4 1993-01-05 124.80 329.0 20380 13.35 13.37 0.28
## 5 1993-01-06 125.09 330.1 18110 13.37 14.72 0.27
## 6 1993-01-07 125.18 329.0 7850 14.72 13.77 0.26
## Day
## 1 1
## 2 2
## 3 3
## 4 4
## 5 5
## 6 6
tail(vix_clean)
## Date USD.JPY Gold.Price Gold.Volume VIX.Close Next.Close
## 8027 2025-03-06 147.95 2926.6 166360 24.87 23.37
## 8028 2025-03-07 148.03 2914.1 237580 23.37 27.86
## 8029 2025-03-10 147.26 2899.4 221610 27.86 26.92
## 8030 2025-03-11 147.77 2920.9 194120 26.92 24.23
## 8031 2025-03-12 148.25 2946.8 209820 24.23 24.66
## 8032 2025-03-13 147.81 2991.3 256560 24.66 21.77
## News.Sentiment Day
## 8027 0.02 8027
## 8028 0.00 8028
## 8029 -0.04 8029
## 8030 -0.05 8030
## 8031 -0.07 8031
## 8032 -0.08 8032
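The merge-then-drop pattern used above can be sketched on a toy scale. This is a base-R analogue that swaps `reduce(full_join)` for `Reduce(merge)`, so it is not the notebook’s exact code. The values are borrowed from the `tail(vix_clean)` output, with one Gold row deliberately left out to show how an incomplete date disappears; the `*_toy` names are hypothetical.

```r
# Three toy tables keyed by Date; the Gold table is missing 3/12/2025 on purpose
jpy_toy  <- data.frame(Date = c("3/10/2025", "3/11/2025", "3/12/2025"),
                       USD.JPY = c(147.26, 147.77, 148.25))
gold_toy <- data.frame(Date = c("3/10/2025", "3/11/2025"),
                       Gold.Price = c(2899.4, 2920.9))
vix_toy  <- data.frame(Date = c("3/10/2025", "3/11/2025", "3/12/2025"),
                       VIX.Close = c(27.86, 26.92, 24.23))

# Full outer join across all tables, keeping every date seen in any of them
merged <- Reduce(function(a, b) merge(a, b, by = "Date", all = TRUE),
                 list(jpy_toy, gold_toy, vix_toy))

# Drop dates where any series is missing, then parse and sort the dates
complete <- na.omit(merged)
complete$Date <- as.Date(complete$Date, format = "%m/%d/%Y")
complete <- complete[order(complete$Date), ]
print(complete)
```

Only dates present in all three tables survive, which is why the merged frame starts in late 1992 rather than at the start of the VIX data.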
We’ll also adjust the frequency of our data, following a rule of thumb: a time series sampled too frequently tends to resemble a random walk, whereas one sampled too infrequently tends to resemble a white noise process. The VIX data were originally sampled daily, so we keep every 5th observation (roughly one per business week) to reduce the random walk effect. If I wanted to be more meticulous, I could select the last trading day of each business week by date instead, but accounting for statutory holidays and other exceptional days would have been a bit of a hassle, so I’ve skipped that here and simply take every Kth observation, where K = 5 by default.
# FREQUENCY TWEAK: DATA WAS ORIGINALLY DAILY, BUT K = 1 RESEMBLED A RANDOM WALK
k <- 5 # TAKE EVERY KTH OBSERVATION
vix_clean <- vix_clean[seq(1, nrow(vix_clean), by = k), ]
# write.csv(vix_clean, "vix_clean_by_5.csv", row.names=FALSE)
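For reference, the date-based alternative mentioned above (keeping the last trading day of each business week) might look something like the following sketch. It was not used in this notebook. It assumes ISO week labels from `format(..., "%G-%V")`, and the toy data frame pretends that Friday 7 March 2025 was a holiday to show that the rule falls back to Thursday.

```r
# Toy trading calendar: two business weeks, with Friday 2025-03-07 removed
dates <- as.Date(c("2025-03-03", "2025-03-04", "2025-03-05", "2025-03-06",
                   "2025-03-10", "2025-03-11", "2025-03-12", "2025-03-13",
                   "2025-03-14"))
toy <- data.frame(Date = dates, Close = seq_along(dates))

# Label each row with its ISO year and week, e.g. "2025-10"
week_id <- format(toy$Date, "%G-%V")

# Keep the last row within each week (rows must already be sorted by date)
weekly <- toy[!duplicated(week_id, fromLast = TRUE), ]
print(weekly)
```

Unlike taking every 5th row, this keeps the sampling aligned with calendar weeks even when a week has fewer than five trading days.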
We’ll start with some exploratory plots of each of the time series to get a sense of how they behave. At a glance, Gold futures prices and the USD/JPY exchange rate both look non-stationary: Gold has trended upward over the years, and the exchange rate ranges widely around its mean (from about 76 to 161). Perhaps differencing would help here, but for now, let’s focus on the others.
# build a ts object for each series; the VIX is the main focus
vix_ts <- ts(vix_clean$VIX.Close)
gold_fut_ts <- ts(vix_clean$Gold.Price)
gold_vol_ts <- ts(vix_clean$Gold.Volume)
sentiment_ts <- ts(vix_clean$News.Sentiment)
exch_ts <- ts(vix_clean$USD.JPY)
plot(vix_ts, main = "VIX Time Series")
plot(diff(vix_ts), main = "VIX, 1st Difference")
plot(gold_fut_ts, main = "Gold Futures Price")
gold_fut_diff <- diff(gold_fut_ts)
plot(gold_fut_diff, main = "Gold Futures Price, 1st Difference")
plot(exch_ts, main = "USD-JPY Exchange Rate")
abline(h = mean(exch_ts), col = 'red')
plot(diff(exch_ts), main = "USD-JPY Exchange Rate, 1st Difference")
summary(exch_ts)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 76.07 103.53 110.17 111.41 119.30 161.45
Next, let’s take a peek at the time series data for Gold futures volumes. We immediately see that the later observations seem to be problematic, with a lot of values dropping to zero. As a result, these data likely won’t be of use alongside the VIX data we’ve collected. Instead, we’ll look at them individually. We’ll drop the problematic data so we can have a better idea of the actual behaviour of this time series.
plot(gold_vol_ts, main = "Gold Volume")
# only keep the first 1200 observations (window() preserves the ts class for plotting)
gold_vol_ts <- window(gold_vol_ts, end = 1200)
plot(gold_vol_ts, main = "Gold Volume, First 1200 Observations")
plot(diff(gold_vol_ts), main = "Gold Volume, 1st Difference")
We’ll next take a look at the autocorrelation function (ACF), partial autocorrelation function (PACF), and extended ACF (EACF) for the Gold futures series and its first difference. This will help us understand the underlying structure of the time series and guide our ARIMA modeling.
acf(gold_fut_ts)
pacf(gold_fut_ts)
eacf(gold_fut_ts)
## AR/MA
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13
## 0 x x x x x x x x x x x x x x
## 1 o o x o x o o o o x o x o o
## 2 x o o o o o o o o x o x o o
## 3 x x o x o o o o o o o x o o
## 4 x x x o o o o o o o o o o o
## 5 x x x x x o x o o o o o o o
## 6 x x x x x o o o o o o o o o
## 7 x x o x x x o o o o o o o o
acf(gold_fut_diff)
pacf(gold_fut_diff)
eacf(gold_fut_diff)
## AR/MA
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13
## 0 o o x o x o o o o x o x o o
## 1 x o o o o o o o o x o x o o
## 2 x x o o o o o o o o o x o o
## 3 x x x o o o o o o o o o o o
## 4 x x x x x o o o o o o o o o
## 5 x x x x x o o o o o o o o o
## 6 x x o x x x o o o o o o o o
## 7 x x o x x x o o o o o o o o
That ACF for the base time series doesn’t look so hot, but the PACF indicates that we should only really look at the behaviour at lag one. The EACF for the raw series likewise points towards an AR(1); the differenced data tell a different story. Despite the spikes outside the approximate 95% confidence bounds at lags 5, 10, 12, 15, etc., I’d prefer simply to look more closely at the behaviour at lag three there. However, given the EACF for the differenced series, perhaps it’s best not to dwell on this for the time being, and instead focus on our main topic: the VIX. Regardless, let’s still fit a simple model to these data, and we can return to it later if so inclined.
We’ll fit an ARIMA model to the Gold futures series, starting with the first difference to ensure stationarity. We’ll also check the residuals for normality using a QQ plot and polynomial roots to assess the stability of the model.
gold_fut_model <- forecast::Arima(gold_fut_ts, order = c(1, 1, 0))
summary(gold_fut_model)
## Series: gold_fut_ts
## ARIMA(1,1,0)
##
## Coefficients:
## ar1
## -0.0386
## s.e. 0.0249
##
## sigma^2 = 729.9: log likelihood = -7572.46
## AIC=15148.91 AICc=15148.92 BIC=15159.68
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 1.688794 27.00046 16.99941 0.1126406 1.679245 0.9996684
## ACF1
## Training set -0.004010365
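# NOTE: polyroot() treats its argument as polynomial coefficients (constant term first),
# so a single AR coefficient is a degree-0 polynomial with no roots; the AR(1)
# characteristic polynomial would be c(1, -ar1)
polyroot(gold_fut_model$coef)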
## complex(0)
qqnorm(residuals(gold_fut_model), main = "Gold Futures Model Residuals QQ")
And there we have it: when fitting an ARI(1,1) to the Gold futures data, the AR(1) coefficient is within 2 standard errors of 0, so it adds little beyond a plain random walk. Furthermore, our Q-Q plot is imperfect, so let’s not use this for now.
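The “within 2 standard errors” remark can be checked by hand from the printed summary. A small sketch, using only the estimate and standard error reported above (the variable names are illustrative):

```r
ar1    <- -0.0386  # AR(1) estimate from the summary above
ar1_se <-  0.0249  # its reported standard error

# Ratio of the estimate to its standard error, and a rough +/- 2 s.e. interval
z  <- ar1 / ar1_se
ci <- ar1 + c(-2, 2) * ar1_se

within_two_se <- abs(z) < 2
cat("z ratio:", round(z, 2), "\n")
cat("approx. interval:", round(ci, 4), "\n")
cat("within 2 s.e. of zero:", within_two_se, "\n")
```

The rough interval straddles zero, which is the basis for setting this model aside.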
We’ll now do something similar for the Gold volume and news sentiment series, focusing on the latter due to the issues mentioned previously. We’ll plot the time series, check for stationarity, and examine the ACF, PACF, and EACF to understand their structure.
plot(sentiment_ts, main = "News Sentiment")
abline(h = mean(sentiment_ts), col = 'red')
sentiment_diff <- diff(sentiment_ts)
plot(sentiment_diff, main = "News Sentiment, 1st Difference")
acf(sentiment_diff)
pacf(sentiment_diff)
eacf(sentiment_diff)
## AR/MA
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13
## 0 x o o o o o o o o o o o o o
## 1 x o o o o o o o o o o o o o
## 2 x o o o o o o o o o o o o o
## 3 x x x o o o o o o o o o o o
## 4 x x x o o o o o o o o o o o
## 5 x o x o o o o o o o o o o o
## 6 x o x x x x o o o o o o o o
## 7 x x x x x x x o o o o o o o
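The data for news sentiment look surprisingly good! Perhaps this may be of use to us. In the EACF of the differenced series, the triangle of o’s has its vertex at row 0, column 1, which points to an MA(1) on the differences (that is, an ARIMA(0,1,1)) as a natural first candidate, with richer ARIMA models as alternatives.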
Let’s take a look at the characteristics of the raw VIX time series, including its ACF, PACF, and EACF. This will help us determine the appropriate parameters for our ARIMA model. We’ll also check the stationarity of the series and visualize it.
plot(vix_ts, main = "VIX Time Series")
acf(vix_ts)
eacf(vix_ts)
## AR/MA
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13
## 0 x x x x x x x x x x x x x x
## 1 x o o x o o o o o x o o o o
## 2 x o o x o o o o o o o o o o
## 3 o o x x o o o o o o o o o o
## 4 o x x o o o o o o o o o o o
## 5 x x x x o o o o o o o o o o
## 6 x x x x x o o o o o o o o o
## 7 x x x x x x o o o o o o o o
pacf(vix_ts)
From the first plot, we immediately notice massive spikes in the VIX corresponding to major market events, such as various financial crises (the dot-com bubble, 2008, COVID-19, etc.). The decay in the ACF suggests an AR component, as does the EACF. Looking at the PACF, we see two significant lags worth considering, with smaller peaks at lags 5 and beyond, which can be ignored.
Here, I also declare and set parameters for the ARIMA model, including the order of differencing and the AR and MA terms, which make their way into the tsorder vector. I also set variables for the name of the model (for plotting), as well as the number of historical observations (and forecasted values) I’d like to see in the plots I eventually make.
Using the dynamics method of model fitting, I settled on an ARIMA(2, 1, 1) model for the VIX time series, based on the ACF and PACF plots as well as the EACF.
p <- 2
d <- 1
q <- 1
tsorder <- c(p,d,q)
model <- paste("ARIMA(", p, ",", d, ",", q, ")", sep = "")
hist_length <- 60
forecast_length <- 20
VIX_model <- forecast::Arima(vix_ts, order = tsorder)
VIX_model
## Series: vix_ts
## ARIMA(2,1,1)
##
## Coefficients:
## ar1 ar2 ma1
## 0.7841 0.1036 -0.9759
## s.e. 0.0273 0.0260 0.0107
##
## sigma^2 = 9.42: log likelihood = -4078.56
## AIC=8165.12 AICc=8165.14 BIC=8186.64
polyroot(VIX_model$coef)
## [1] 0.9510123+2.908057e-25i -0.8448399-2.908057e-25i
qqnorm(residuals(VIX_model), main = "VIX Model Residuals QQ Plot")
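The roots printed above should be read with caution: polyroot() was given the raw coefficient vector (ar1, ar2, ma1), which is not the model’s characteristic polynomial, so they are not a direct check of stationarity or invertibility. The AR check uses the roots of 1 - ar1*z - ar2*z^2, and the MA check uses the root of 1 + ma1*z. The Q-Q plot looks good enough, though it trails off at the ends as per usual. This model may be usable.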
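As a hedged sketch of how the root check for this ARIMA(2,1,1) fit could be done, the snippet below builds the AR and MA polynomials explicitly from the coefficients printed above and passes them to `polyroot()`, which expects coefficients in increasing order of power. It assumes R’s sign convention for `Arima()` (AR polynomial 1 - ar1*z - ar2*z^2, MA polynomial 1 + ma1*z); the variable names are illustrative.

```r
ar <- c(0.7841, 0.1036)  # ar1, ar2 from the fitted model above
ma <- -0.9759            # ma1 from the fitted model above

# Roots of the AR and MA polynomials, constant term first
ar_roots <- polyroot(c(1, -ar))
ma_roots <- polyroot(c(1, ma))

# Stationarity / invertibility require every root to lie outside the unit circle
ar_ok <- all(Mod(ar_roots) > 1)
ma_ok <- all(Mod(ma_roots) > 1)

cat("AR root moduli:", round(Mod(ar_roots), 3), "stationary:", ar_ok, "\n")
cat("MA root moduli:", round(Mod(ma_roots), 3), "invertible:", ma_ok, "\n")
```

The MA root works out to 1 / 0.9759, only slightly outside the unit circle. A near-unit MA root is sometimes read as a hint of over-differencing, so it may be worth comparing against a model with d = 0.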
I was also somewhat morbidly curious about what R would automatically fit to the VIX time series, so I ran the auto.arima function on it. I would never actually use this in practice, as I prefer to have more control over the model parameters, but it was interesting to see what R would choose.
# check what R automatically fits to vix_ts
auto_fit <- auto.arima(vix_ts)
summary(auto_fit)
## Series: vix_ts
## ARIMA(2,0,0) with non-zero mean
##
## Coefficients:
## ar1 ar2 mean
## 0.8093 0.1240 19.5059
## s.e. 0.0247 0.0247 1.1357
##
## sigma^2 = 9.411: log likelihood = -4081.06
## AIC=8170.12 AICc=8170.15 BIC=8191.65
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.006030498 3.064877 1.944139 -1.728277 9.559953 0.9820953
## ACF1
## Training set 0.001300072
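# NOTE: polyroot() is given (ar1, ar2, mean) here, which is not the AR(2)
# characteristic polynomial c(1, -ar1, -ar2)
polyroot(auto_fit$coef)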
## [1] -0.003179542+0.2036651i -0.003179542-0.2036651i
qqnorm(residuals(auto_fit), main = "VIX Auto ARIMA Residuals QQ Plot")
Surprisingly, auto.arima returned an AR(2) model, which is more reasonable than I was expecting (I’ve had it spit out some horribly complex models when fitting COVID data in the past!). However, I’m still more comfortable with the model I fit myself.
Next, I applied the ARIMA model to generate forecasts, using the forecast length from above. The plot is made using information from the same code block above (title, number of historical observations, number of predictions, etc).
VIX_model_forecast <- forecast(VIX_model, h = forecast_length)
last_day <- max(vix_clean$Day)
forecast_df <- data.frame(
Day = seq(from = last_day + k, by = k, length.out = forecast_length),
Forecast = as.numeric(VIX_model_forecast$mean),
Lo95 = VIX_model_forecast$lower[,2],
Hi95 = VIX_model_forecast$upper[,2]
)
tail_VIX <- tail(vix_clean, hist_length)
combined_plot <- plot_ly() %>%
add_trace(
data = tail_VIX,
x = ~Day,
y = ~VIX.Close, # the model was fit on same-day closes, so plot those
type = 'scatter',
mode = 'lines',
name = 'Historical VIX',
line = list(width = 2, color = 'blue')
) %>%
add_trace(
data = forecast_df,
x = ~Day,
y = ~Forecast,
type = 'scatter',
mode = 'lines',
name = 'Forecast',
line = list(width = 2, dash = 'dash', color = 'red')
) %>%
add_ribbons(
data = forecast_df,
x = ~Day,
ymin = ~Lo95,
ymax = ~Hi95,
name = "95% CI",
fillcolor = 'rgba(135,206,250,0.3)',
line = list(color = 'rgba(135,206,250,0.1)')
) %>%
layout(
title = paste("VIX: Last", hist_length, "Observations &", forecast_length, "Step", model, "Forecast"),
xaxis = list(title = "Days Since Start"),
yaxis = list(title = "VIX"),
showlegend = TRUE
)
combined_plot
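Because the series was thinned to every 5th row, each forecast step corresponds to 5 trading days rather than one. The following sketch reproduces how the forecast `Day` index lines up with the historical one, assuming the 8032 daily rows shown in the `tail(vix_clean)` output before thinning; `n_daily`, `h`, `kept_days`, and `forecast_days` are illustrative names.

```r
n_daily <- 8032  # rows in vix_clean before thinning (from the tail output)
k <- 5           # thinning interval used above
h <- 20          # forecast_length

# Day values that survive the thinning step: 1, 6, 11, ...
kept_days <- seq(1, n_daily, by = k)
last_day  <- max(kept_days)

# Forecast steps continue on the same grid, k trading days apart
forecast_days <- seq(from = last_day + k, by = k, length.out = h)

cat("observations kept:", length(kept_days), "\n")
cat("last historical Day:", last_day, "\n")
cat("forecast Days run from", forecast_days[1], "to", forecast_days[h], "\n")
```

So a 20-step forecast spans roughly 100 trading days on the x-axis.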
Next, we’ll fit a VAR model to see if there are any interactions between VIX and news sentiment, then make and plot forecasts similarly to what we just did with the ARIMA model.
vix_sentiment_ts <- ts(vix_clean[, c("VIX.Close", "News.Sentiment")])
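# `lag` is not an argument of vars::VAR(); it partially matches `lag.max`, so the
# order is chosen by information criterion (up to 2 lags) rather than fixed with `p`
var_model <- VAR(vix_sentiment_ts, lag.max = 2)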
summary(var_model)
##
## VAR Estimation Results:
## =========================
## Endogenous variables: VIX.Close, News.Sentiment
## Deterministic variables: const
## Sample size: 1605
## Log Likelihood: -1196.084
## Roots of the characteristic polynomial:
## 0.9676 0.923 0.1137 0.08488
## Call:
## VAR(y = vix_sentiment_ts, lag.max = 2)
##
##
## Estimation results for equation VIX.Close:
## ==========================================
## VIX.Close = VIX.Close.l1 + News.Sentiment.l1 + VIX.Close.l2 + News.Sentiment.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## VIX.Close.l1 0.80835 0.02498 32.364 < 2e-16 ***
## News.Sentiment.l1 0.49117 1.86670 0.263 0.792
## VIX.Close.l2 0.11733 0.02524 4.648 3.63e-06 ***
## News.Sentiment.l2 -1.12935 1.84069 -0.614 0.540
## const 1.48236 0.24848 5.966 2.99e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 3.069 on 1600 degrees of freedom
## Multiple R-Squared: 0.8567, Adjusted R-squared: 0.8563
## F-statistic: 2391 on 4 and 1600 DF, p-value: < 2.2e-16
##
##
## Estimation results for equation News.Sentiment:
## ===============================================
## News.Sentiment = VIX.Close.l1 + News.Sentiment.l1 + VIX.Close.l2 + News.Sentiment.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## VIX.Close.l1 -0.0027449 0.0003312 -8.288 2.41e-16 ***
## News.Sentiment.l1 1.0534522 0.0247519 42.560 < 2e-16 ***
## VIX.Close.l2 0.0020751 0.0003347 6.199 7.19e-10 ***
## News.Sentiment.l2 -0.0934251 0.0244071 -3.828 0.000134 ***
## const 0.0145050 0.0032948 4.402 1.14e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.04069 on 1600 degrees of freedom
## Multiple R-Squared: 0.9554, Adjusted R-squared: 0.9553
## F-statistic: 8567 on 4 and 1600 DF, p-value: < 2.2e-16
##
##
##
## Covariance matrix of residuals:
## VIX.Close News.Sentiment
## VIX.Close 9.41661 -0.016664
## News.Sentiment -0.01666 0.001656
##
## Correlation matrix of residuals:
## VIX.Close News.Sentiment
## VIX.Close 1.0000 -0.1335
## News.Sentiment -0.1335 1.0000
var_forecast <- predict(var_model, n.ahead = forecast_length)
var_forecast_df <- data.frame(
Day = seq(from = last_day + k, by = k, length.out = forecast_length),
VIX_Forecast = as.numeric(var_forecast$fcst$VIX.Close[,1]),
Sentiment_Forecast = as.numeric(var_forecast$fcst$News.Sentiment[,1]),
VIX_Lo95 = var_forecast$fcst$VIX.Close[,2],
VIX_Hi95 = var_forecast$fcst$VIX.Close[,3],
Sentiment_Lo95 = var_forecast$fcst$News.Sentiment[,2],
Sentiment_Hi95 = var_forecast$fcst$News.Sentiment[,3]
)
var_combined_plot <- plot_ly() %>%
add_trace(
data = tail_VIX,
x = ~Day,
y = ~VIX.Close, # the model was fit on same-day closes, so plot those
type = 'scatter',
mode = 'lines',
name = 'Historical VIX',
line = list(width = 2, color = 'blue')
) %>%
add_trace(
data = var_forecast_df,
x = ~Day,
y = ~VIX_Forecast,
type = 'scatter',
mode = 'lines',
name = 'VAR VIX Forecast',
line = list(width = 2, dash = 'dash', color = 'red')
) %>%
add_trace(
data = var_forecast_df,
x = ~Day,
y = ~Sentiment_Forecast,
type = 'scatter',
mode = 'lines',
name = 'VAR Sentiment Forecast',
line = list(width = 2, dash = 'dash', color = 'green')
) %>%
add_ribbons(
data = var_forecast_df,
x = ~Day,
ymin = ~VIX_Lo95,
ymax = ~VIX_Hi95,
name = "VAR VIX 95% CI",
fillcolor = 'rgba(135,206,250,0.3)',
line = list(color = 'rgba(135,206,250,0.1)')
) %>%
add_ribbons(
data = var_forecast_df,
x = ~Day,
ymin = ~Sentiment_Lo95,
ymax = ~Sentiment_Hi95,
name = "VAR Sentiment 95% CI",
fillcolor = 'rgba(144,238,144,0.3)',
line = list(color = 'rgba(144,238,144,0.1)')
) %>%
layout(
title = "VAR Model Forecasts",
xaxis = list(title = "Days Since Start"),
yaxis = list(title = "Values"),
showlegend = TRUE
)
var_combined_plot
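The “Roots of the characteristic polynomial” line in the VAR summary above can be roughly reproduced from the printed coefficients. This sketch stacks the two lag matrices into a companion matrix and takes the moduli of its eigenvalues. The coefficients are the rounded values from the summary, so the result should only match the printed roots approximately; `A1`, `A2`, and `companion` are illustrative names.

```r
# Lag-1 and lag-2 coefficient matrices; rows are equations (VIX.Close,
# News.Sentiment) and columns are the lagged VIX.Close and News.Sentiment terms
A1 <- matrix(c( 0.80835,   0.49117,
               -0.0027449, 1.0534522), nrow = 2, byrow = TRUE)
A2 <- matrix(c( 0.11733,   -1.12935,
                0.0020751, -0.0934251), nrow = 2, byrow = TRUE)

# Companion form of the VAR(2): [A1 A2; I 0]
companion <- rbind(cbind(A1, A2),
                   cbind(diag(2), matrix(0, 2, 2)))

# The VAR is stable when every eigenvalue has modulus below 1
moduli <- sort(Mod(eigen(companion)$values), decreasing = TRUE)
print(round(moduli, 4))
cat("stable:", all(moduli < 1), "\n")
```

The two largest moduli sit fairly close to 1, which is consistent with how persistent both series are.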
Finally, as a direct comparison, we’ll examine the half-widths of the 95% prediction intervals (the alpha_band column) from the ARIMA and VAR model forecasts.
arima_pred <- c(as.numeric(VIX_model_forecast$mean))
ci_arimalo <- c(VIX_model_forecast$lower[,2])
ci_arimahi <- c(VIX_model_forecast$upper[,2])
ci_arima <- as.data.frame(arima_pred) %>%
mutate(ci_arimalo = ci_arimalo) %>%
mutate(ci_arimahi = ci_arimahi) %>%
mutate(alpha_band = arima_pred - ci_arimalo)
ci_arima
## arima_pred ci_arimalo ci_arimahi alpha_band
## 1 23.35733 17.341861 29.37280 6.015470
## 2 22.91139 15.176960 30.64582 7.734430
## 3 22.47131 13.482441 31.46018 8.988869
## 4 22.08004 12.141362 32.01872 9.938679
## 5 21.72765 11.034986 32.42032 10.692665
## 6 21.41080 10.103641 32.71797 11.307163
## 7 21.12585 9.307990 32.94372 11.817863
## 8 20.86960 8.620778 33.11841 12.248817
## 9 20.63914 8.022033 33.25625 12.617108
## 10 20.43189 7.496581 33.36720 12.935310
## 11 20.24551 7.032577 33.45844 13.212933
## 12 20.07790 6.620578 33.53521 13.457318
## 13 19.92716 6.252934 33.60138 13.674225
## 14 19.79160 5.923363 33.65984 13.868237
## 15 19.66969 5.626653 33.71273 14.043039
## 16 19.56006 5.358436 33.76168 14.201622
## 17 19.46146 5.115027 33.80790 14.346436
## 18 19.37280 4.893294 33.85230 14.479502
## 19 19.29306 4.690560 33.89555 14.602497
## 20 19.22135 4.504527 33.93817 14.716820
var_pred <- c(as.numeric(var_forecast$fcst$VIX.Close[,1]))
ci_varlo <- c(var_forecast$fcst$VIX.Close[,2])
ci_varhi <- c(var_forecast$fcst$VIX.Close[,3])
ci_var <- as.data.frame(var_pred) %>%
mutate(ci_varlo = ci_varlo) %>%
mutate(ci_varhi = ci_varhi) %>%
mutate(alpha_band = var_pred - ci_varlo)
ci_var
## var_pred ci_varlo ci_varhi alpha_band
## 1 23.57346 17.559019 29.58790 6.014442
## 2 23.41913 15.688589 31.14967 7.730539
## 3 23.23314 14.222188 32.24409 9.010952
## 4 23.06349 13.066238 33.06075 9.997256
## 5 22.90296 12.111970 33.69394 10.790987
## 6 22.75155 11.307011 34.19609 11.444541
## 7 22.60861 10.617202 34.60001 11.991405
## 8 22.47358 10.019198 34.92796 12.454383
## 9 22.34596 9.496072 35.19586 12.849893
## 10 22.22528 9.035072 35.41550 13.190212
## 11 22.11110 8.626307 35.59589 13.484791
## 12 22.00300 8.261932 35.74407 13.741067
## 13 21.90060 7.935610 35.86560 13.964993
## 14 21.80356 7.642152 35.96497 14.161407
## 15 21.71154 7.377255 36.04582 14.334282
## 16 21.62423 7.137316 36.11114 14.486911
## 17 21.54135 6.919295 36.16340 14.622050
## 18 21.46262 6.720605 36.20464 14.742019
## 19 21.38782 6.539035 36.23660 14.848781
## 20 21.31669 6.372681 36.26070 14.944009
We see that the VAR model’s intervals start off marginally tighter than those of the ARIMA model (half-widths of 6.014 versus 6.015 at the first step), but from the third step onward they are slightly wider, reaching 14.94 versus 14.72 at step 20. The gap is small (about 1.5% at the longest horizon), so the VAR model appears at best marginally less precise for longer-term forecasts, at least in this case.
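To put numbers on the comparison, this sketch lines up the `alpha_band` half-widths from the two tables above at a few horizons. The values are copied from the printed output, and the column names in `cmp` are illustrative.

```r
steps      <- c(1, 2, 3, 10, 20)
arima_half <- c(6.015470, 7.734430, 8.988869, 12.935310, 14.716820)
var_half   <- c(6.014442, 7.730539, 9.010952, 13.190212, 14.944009)

cmp <- data.frame(
  step     = steps,
  arima    = arima_half,
  var      = var_half,
  diff     = var_half - arima_half,               # positive: VAR band is wider
  pct_diff = 100 * (var_half / arima_half - 1)    # relative difference in %
)
print(round(cmp, 3))
```

The sign of `diff` flips between steps 2 and 3, and the relative gap stays under 2% even at step 20.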
I intend to gather and use data from March 2025 to the present to evaluate the goodness of these models, which may be difficult considering the volatility caused by the current political environment in the US.
I’d like to try some other models in the future. It was recommended that I study stochastic differential equations for continuous-time modeling, which I could then use to simulate discrete time steps. Upon further research, I also found that GARCH models might be useful for modeling volatility in financial time series, which could be particularly relevant to a study of the VIX.